Each of these pages provides an analysis run through for a different type of design. Each document is structured in the same way:
Please note that there will be only minimal explanation of the steps undertaken here, as these pages are intended as example analyses rather than additional labs readings. Please also be aware that there are many decisions to be made throughout conducting analyses, and it may be the case that you disagree with some of the choices we make here. As always with these things, it is how we justify our choices that is important. We warmly welcome any feedback and suggestions to improve these examples: please email ug.ppls.stats@ed.ac.uk.
The idea behind “repeated measures” is that the same variable is measured on the same set of subjects over two or more time periods or under different conditions.
The data used for this worked example are simulated to represent data from 50 participants, each measured at 3 different time-points on an outcome variable of interest. This is a fairly simple design, leading from a question such as “how does [dependent variable] change over time?” You might easily think of the 3 time points as 3 different experimental conditions instead (condition1, condition2, condition3) and ask “how does [dv] change over depending on [independent variable]?”
This is a very simple way to simulate repeated measures data structure (with long data). There are a good number of other approaches, but this will do for now as you may well be familiar with all the functions involved:
set.seed(347)
library(tidyverse)
simRPT <- tibble(
pid = factor(rep(paste("ID", 1:50, sep=""),each=3)),
ppt_int = rep(rnorm(50,0,10),each=3), # add some participantz random-ness
dv = rnorm(150,c(40,50,70),sd=10) + ppt_int,
iv = factor(rep(c("T1", "T2", "T3"), each=1, 50))
)
If you are unclear about any section of the code above, why not try running small bits of it in your console to see what it is doing?
For instance, try running:
paste("ID", 1:50, sep="")
rep(paste("ID", 1:50, sep=""),each=3)
factor(rep(paste("ID", 1:50, sep=""),each=3))
Because we simulated our data, it is already nice and tidy. Each observation is a row, and we have variable indicating participant id (pid
).
head(simRPT)
## # A tibble: 6 × 4
## pid ppt_int dv iv
## <fct> <dbl> <dbl> <fct>
## 1 ID1 -1.92 41.6 T1
## 2 ID1 -1.92 40.2 T2
## 3 ID1 -1.92 63.6 T3
## 4 ID2 -11.1 22.9 T1
## 5 ID2 -11.1 40.5 T2
## 6 ID2 -11.1 60.6 T3
Let’ see our summaries per time-point:
sumRPT <-
simRPT %>%
group_by(iv) %>%
summarise(
n = n_distinct(pid),
mean.dv = round(mean(dv, na.rm=T),2),
sd.dv = round(sd(dv, na.rm=T),2)
)
sumRPT
## # A tibble: 3 × 4
## iv n mean.dv sd.dv
## <fct> <int> <dbl> <dbl>
## 1 T1 50 38.8 13.1
## 2 T2 50 50.8 11.2
## 3 T3 50 67.8 13.5
We can make this a little prettier:
library(knitr)
library(kableExtra)
kable(sumRPT) %>%
kable_styling("striped")
iv | n | mean.dv | sd.dv |
---|---|---|---|
T1 | 50 | 38.84 | 13.10 |
T2 | 50 | 50.82 | 11.16 |
T3 | 50 | 67.81 | 13.48 |
Well…we knew what the answer was going to be, but there we have it, our scores improve across the three administrations of our test.
We can construct some simple plots showing distribution of the outcome variable at each level of the independent variable (iv):
simRPT %>%
ggplot(aes(x = iv, y = dv)) +
geom_violin() +
geom_jitter(alpha=.5,width=.1,height=0) +
labs(x="Time", y = "Test Score",
title="Scores across trials",
subtitle = "Violin Plots with (jittered) Observations")+
theme_minimal()
So what does this show? Essentially we are plotting all responses at each point in time. The points are jittered
so that they are not all overlaid on one another. The areas marked at each time point are mirrored density plots (i.e. they show the distribution of the scores at each point in time).
If you want to get an intuitive sense of these plotted areas, look at them against the mean’s and sd’s per time point calculated above.
simRPT %>%
ggplot(aes(as.numeric(iv), dv)) +
stat_summary(fun.data = mean_cl_boot, geom="ribbon", alpha=.3) +
stat_summary(fun.y = mean, geom="line") +
labs(x="Time", y = "Test Score",
title="Scores across trials",
subtitle = "Mean and Boostrapped 95% Confidence Intervals")
We can also show each participants’ trajectory over time, by using the group
aesthetic mapping.
simRPT %>%
ggplot(aes(x = iv, y = dv)) +
geom_point(size=3, alpha=.4)+
geom_line(aes(group=pid), alpha = .2) +
theme_minimal()
We’re going to fit this model, and examine the change in dv
associated with moving from time-point 1 to each subsequent time-point.
Recall that because iv
is categorical with 3 levels, we’re going to be estimating 2 (\(3-1\)) coefficients. \[
\begin{aligned}
\operatorname{dv}_{i[j]} &= \beta_{0i} + \beta_1(\operatorname{iv}_{\operatorname{T2}_j}) + \beta_2(\operatorname{iv}_{\operatorname{T3}_j}) + \varepsilon_{i[j]} \\
\beta_{0i} &= \gamma_{00} + \zeta_{0i} \\
& \text{for }\operatorname{pid}\text{ i = 1,} \dots \text{,I}
\end{aligned}
\]
library(lme4)
Here we run an empty model so that we have something to compare our model which includes our iv. Other than to give us a reference model, we do not have a huge amount of interest in this.
m0 <- lmer(dv ~ 1 + (1 | pid), data = simRPT)
Next, we specify our model. Here we include a fixed effect of our predictor (group membership, iv
), and a random effect of participant (iv
) to take account of the fact we have three measurements per person.
m1 <- lmer(dv ~ 1 + iv + (1 | pid), data = simRPT)
summary(m1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: dv ~ 1 + iv + (1 | pid)
## Data: simRPT
##
## REML criterion at convergence: 1144.7
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -2.52990 -0.57636 -0.02015 0.56073 2.50816
##
## Random effects:
## Groups Name Variance Std.Dev.
## pid (Intercept) 74.66 8.641
## Residual 84.60 9.198
## Number of obs: 150, groups: pid, 50
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 38.842 1.785 21.76
## ivT2 11.976 1.840 6.51
## ivT3 28.964 1.840 15.74
##
## Correlation of Fixed Effects:
## (Intr) ivT2
## ivT2 -0.515
## ivT3 -0.515 0.500
And we can compare our models.
library(pbkrtest)
PBmodcomp(m1, m0)
## Bootstrap test; time: 18.87 sec; samples: 1000; extremes: 0;
## large : dv ~ 1 + iv + (1 | pid)
## dv ~ 1 + (1 | pid)
## stat df p.value
## LRT 126.84 2 < 2.2e-16 ***
## PBtest 126.84 0.000999 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(m0,m1)
## Data: simRPT
## Models:
## m0: dv ~ 1 + (1 | pid)
## m1: dv ~ 1 + iv + (1 | pid)
## npar AIC BIC logLik deviance Chisq Df Pr(>Chisq)
## m0 3 1285.9 1294.9 -639.94 1279.9
## m1 5 1163.0 1178.1 -576.52 1153.0 126.83 2 < 2.2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
OK, so we can see that we appear to have a significant effect of our repeated factor here. Our parametric bootstrap LRT is in agreement here.
Comparison to aov()
Using anova()
to compare multilevel models will not give you a typical ANOVA output.
For piece of mind, it can be useful to compare how we might do this in aov()
m2 <- aov(dv ~ iv + Error(pid), data = simRPT)
Here the term Error(pid)
is specifying the within person error, or residual. This is what we are doing with our random effect (1 | pid)
in lmer()
And we can compare the model sums of squares from both approaches to see the equivalence:
summary(m2)
##
## Error: pid
## Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 49 15121 308.6
##
## Error: Within
## Df Sum Sq Mean Sq F value Pr(>F)
## iv 2 21183 10591 125.2 <2e-16 ***
## Residuals 98 8291 85
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
anova(m1)
## Analysis of Variance Table
## npar Sum Sq Mean Sq F value
## iv 2 21183 10591 125.19
The residuals look reasonably normally distributed, and there seems to be fairly constant variance across the linear predictor. We might be a little concerned about the potential tails of the plot below, at which residuals don’t appear to have a mean of zero
plot(m1, type = c("p","smooth"))
library(lattice)
qqmath(m1)
Random effects are (roughly) normally distributed:
rans <- as.data.frame(ranef(m1)$pid)
ggplot(rans, aes(sample = `(Intercept)`)) +
stat_qq() + stat_qq_line() +
labs(title="random intercept")
library(sjPlot)
plot_model(m1, type="pred")
## $iv
plot_model(m1, type="re") # an alternative: dotplot.ranef.mer(ranef(m1))
Parametric bootstrap 95% CIs:
confint(m1, method = "boot")
## 2.5 % 97.5 %
## .sig01 5.999069 11.24815
## .sigma 7.941403 10.46011
## (Intercept) 35.397974 42.37131
## ivT2 8.313544 15.62274
## ivT3 25.286894 32.52241
Case-resample bootstrap 95% CIs:
library(lmeresampler)
bootmodel <- bootstrap(m1, fixef, type = "case", B = 999, resample = c(TRUE,FALSE))
confint(bootmodel, type = "perc")
## # A tibble: 3 × 6
## term estimate lower upper type level
## <chr> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 (Intercept) 38.8 35.8 42.0 perc 0.95
## 2 ivT2 12.0 8.44 15.5 perc 0.95
## 3 ivT3 29.0 25.5 32.6 perc 0.95
res <- confint(bootmodel, type="perc")
res <- res %>% mutate_if(is.numeric,~round(.,2))
res
## # A tibble: 3 × 6
## term estimate lower upper type level
## <chr> <dbl> <dbl> <dbl> <chr> <dbl>
## 1 (Intercept) 38.8 35.8 42.0 perc 0.95
## 2 ivT2 12.0 8.44 15.5 perc 0.95
## 3 ivT3 29.0 25.5 32.6 perc 0.95
Writing up an interpretation of this is a bit clunky as we have abstract names for our variables like “dv” and “iv”, but as a rough starting point:
Change in [dv] over [iv] was modeled using a linear mixed effects model, with a fixed effect of [iv] and a by-[pid] random intercepts. At baseline, scores on [dv] were estimated at 38.84 (cluster-resample bootstrap 95% CI: 35.79–42.01). Results indicated that, relative to time-point 1, scores at time-point 2 increased by 11.98 (8.44–15.54), and at time-point 3 had increased by 28.96 (25.48–32.6).